A Minimal Route to Transformer Attention
neelsomaniblog.com·2d·
Discuss: Hacker News
🧩Attention Kernels
[D] Best (free) courses on neural networks
reddit.com·3h·
🧩Attention Kernels
Neural bases of sustained attention during naturalistic parent-infant interactions
nature.com·1d
🧩Attention Kernels
Everything About Transformers
krupadave.com·2d
🧩Attention Kernels
Your Transformer is Secretly an EOT Solver
elonlit.com·1d·
Discuss: Hacker News
Flash Attention
GIR-Bench: Versatile Benchmark for Generating Images with Reasoning
paperium.net·1d·
Discuss: DEV
🏎️TensorRT
Minimax pre-training lead explains why no linear attention
reddit.com·2d·
Discuss: r/LocalLLaMA
Flash Attention
Clarity From Chaos: AI Super-Resolution Redefined
dev.to·12h·
Discuss: DEV
Flash Attention
Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
arxiv.org·1d
🧮cuDNN
RF-DETR Under the Hood: The Insights of a Real-Time Transformer Detection
towardsdatascience.com·1d
🏎️TensorRT
Show HN: Hot or Slop – Visual Turing test on how well humans detect AI images
hotorslop.com·1d·
Discuss: Hacker News
Flash Attention
Dual-format attentional template during preparation in human visual cortex
elifesciences.org·3d
🧩Attention Kernels
An underqualified reading list about the transformer architecture
fvictorio.github.io·2d·
Discuss: Hacker News
🧩Attention Kernels
Breaking the Curse of Dimensionality: A Game-Changer for L
dev.to·1d·
Discuss: DEV
🧩Attention Kernels
Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs
arxiv.org·3d
🧩Attention Kernels
🧠 Soft Architecture (Part B): Emotional Timers and the Code of Care (Part 5 of the SaijinOS series)
dev.to·6h·
Discuss: DEV
🤖AI Coding Tools
After distractions, rotating brain waves may help thought circle back to the task
medicalxpress.com·1d
Flash Attention
Brumby-14B-Base: The Strongest Attention-Free Base Model
manifestai.com·2d·
Discuss: Hacker News
🏎️TensorRT
Emergent introspective awareness in large language models
transformer-circuits.pub·1d·
Discuss: Hacker News
Flash Attention